September 14, 2015

The main ingredient

  • A policy simulator makes sense only as long as the agents are adaptive
  • Main decision: where do I go fishing?
  • This is currently done in two ways:
    • Multinomial logit discrete choice models
    • Dynamic Programming

Complications

  • Maps are large!
  • The biomass distribution is unknown
  • The biomass distribution changes when interacted with
  • Other agents interact with biomass at the same time

Epsilon-greedy for adaptation

  • Write once, use always
  • Judge trips by their profit per unit of time, \(\frac{\Pi}{t}\)
  • 80% of the time explore, 20% of the time exploit
  • When exploiting, fish at the seatile that has been most profitable so far
  • When exploring, fish at a random seatile neighboring the best one you know
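
A minimal sketch of the rule above, not the simulator's actual code. It assumes the map is a rectangular grid of seatiles addressed by (x, y), that "neighboring" means the eight surrounding tiles, and that a fisher with no history starts at a random tile. The names `EpsilonGreedyFisher`, `choose_tile`, and `record_trip` are hypothetical. The 0.8 exploration probability is the figure from the slide and is left as a parameter.

```python
import random


class EpsilonGreedyFisher:
    """Hypothetical fisher that picks seatiles by an epsilon-greedy rule."""

    def __init__(self, width, height, explore_prob=0.8, rng=None):
        self.width = width
        self.height = height
        self.explore_prob = explore_prob  # value from the slide; adjustable
        self.rng = rng or random.Random()
        self.best_tile = None
        self.best_score = float("-inf")

    def neighbors(self, tile):
        """Return the in-bounds tiles adjacent to `tile` (assumed 8-neighborhood)."""
        x, y = tile
        found = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nx, ny = x + dx, y + dy
                if (dx, dy) != (0, 0) and 0 <= nx < self.width and 0 <= ny < self.height:
                    found.append((nx, ny))
        # On a 1x1 map there is nothing to explore, so stay put.
        return found or [tile]

    def choose_tile(self):
        """Pick the seatile for the next trip."""
        if self.best_tile is None:
            # Assumption: with no history, start somewhere at random.
            return (self.rng.randrange(self.width), self.rng.randrange(self.height))
        if self.rng.random() < self.explore_prob:
            # Explore: a random tile next to the best one known so far.
            return self.rng.choice(self.neighbors(self.best_tile))
        # Exploit: go back to the most profitable tile so far.
        return self.best_tile

    def record_trip(self, tile, profit, hours):
        """Score a finished trip by profit per unit of time and update memory."""
        score = profit / hours
        # Revisiting the best tile refreshes its score, since biomass changes.
        if tile == self.best_tile or score > self.best_score:
            self.best_tile = tile
            self.best_score = score
```

The fisher remembers a single best tile rather than a full map of values, which keeps memory small on large maps. Refreshing the score on every return visit lets a depleted spot lose its lead.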

A simple problem

Drawing

Finding the right spot

With a little help

  • We can create a social network linking fishers
  • They can exchange information about the best spot
  • When "exploiting", a fisher can go to his own best spot or to the best spot of one of his friends.
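
One possible reading of this exchange, sketched under assumptions: each fisher remembers a single (seatile, profit per unit of time) pair, and when exploiting he heads for the highest-scoring spot among his own and his friends'. The function name `exploit_target` and the data layout are hypothetical, and the slide does not say how the choice between own and friends' spots is made.

```python
def exploit_target(own_best, friends_best):
    """Return the seatile to exploit, or None if nobody knows a spot yet.

    own_best: a (tile, profit_per_time) pair, or None.
    friends_best: list of such pairs (or None) reported by linked fishers.
    """
    known = [memory for memory in [own_best, *friends_best] if memory is not None]
    if not known:
        return None
    # Assumption: choose the spot with the highest reported profit per unit of time.
    best_tile, _ = max(known, key=lambda memory: memory[1])
    return best_tile


# A fisher earning 12 per hour at (3, 4) hears of a friend earning 20 at (7, 1).
target = exploit_target(((3, 4), 12.0), [((7, 1), 20.0), None])
```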

A simple run

Oil Prices

Fish the line (part 1)

Fish the line (part 2)

By-catch (part 1)

By-catch (part 2)

Changing Gear

Changing Gear (part 2)

The limits of feedback

  • Trial and error is versatile and can solve multiple problems at once
  • Sometimes you can guess the error without trying
  • Sometimes imitation is hard

Policy Simulation

ITQ Reservation Prices

TAC vs ITQ Different Gear

TAC vs ITQ Gas Efficiency